Statistical Commission and Conference of European Statisticians, UNECE Work Session on Statistical Data Editing: Development of Modern Edit and Imputation Methods at Statistics Netherlands. Contributed Paper Submitted by Statistics Netherlands

Abstract

Topic (iv): Impact of new technologies on statistical data editing
Abstract: The development of modern edit and imputation (E&I) methods and software is one of the spearheads of the Methods and Informatics Department of Statistics Netherlands, and the work currently being carried out covers many aspects of E&I. Software development focuses on the further development of SLICE, a general software framework for automatic E&I. At the moment, SLICE is being extended with three modules: the new Cherry Pie module for automatic error localisation in a mixture of categorical and continuous data, a module for regression imputation of missing continuous data, and the EC System module, which modifies imputed values so that all user-specified edits are satisfied. Besides SLICE, the WAID program for donor imputation of missing values is being developed further. Finally, a software tool for graphical macro-editing is being developed; it builds on the functionality offered by SPSS, enhanced with Visual Basic modules that are integrated into the SPSS environment.
Methodological research focuses on selective editing and automatic E&I. The merits of classification/regression trees and (logistic) regression for selective editing purposes are currently being investigated. For automatic E&I, an ambitious four-year research project is planned to start in early 2002. The aim of this project is to develop an approach for automatic E&I that integrates the approach based on the Fellegi-Holt paradigm with the NIM approach. Like NIM, our approach will use 'predicted' values to locate and correct errors. In contrast to NIM, it will not be based exclusively on hot-deck donor imputation, but will also allow other types of imputation. If no imputation method is specified, the new approach will coincide with the approach based on the Fellegi-Holt paradigm.
In this paper the above-mentioned development of E&I methods and software at Statistics Netherlands is described in some detail.


Similar resources

WP. 18 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS

A uniform statistical process that uses modern methodological editing techniques has been implemented at Statistics Netherlands for annual structural business statistics and the short-term (turnover) statistics. Parallel to the redesign of the process the organisation of the Business Statistics Division has been changed. The implementation of a general uniform process caused new internal depend...


Modernization of Official Statistics, Its Concepts and Main Components

Rapid progress in technology and science makes societies more complex. In this regard, budget constraints, together with users' growing need for high-quality and timely statistics, have confronted national statistical offices (NSOs) with many new challenges in recent years. On the other hand, the emergence of multiple data sources has increased the demand of these factors, together for...


United Nations Statistical Commission and Economic Commission for Europe, Conference of European Statisticians; European Commission, Statistical Office of the European Communities (Eurostat). Joint ECE/Eurostat Work Session on Statistical Data Confidentiality: Balancing Data Quality and Confidentiality for Tabular Data. Invited Paper

1. Tabular data are the earliest form and remain a staple of official statistics data products. Familiar examples of tabular data products in official statistics include count data such as age-race-sex and other demographic data, concentration (or percentage) data such as in financial or energy utilization statistics, and magnitude data such as total retail sales or air pollution data. Confidentia...


WP No. 26 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS

This paper describes a method for imputation in general contingency tables when the imputations are subject to both analytic (edit) constraints and probabilistic distributional constraints. The model extends edit ideas in Fellegi and Holt (1976) and Winkler and Chen (2002). The model extends missing-at-random imputation ideas in Little and Rubin (1987). Some of the ideas are related to Friedman...



Publication date: 2002